Towards A Differential Privacy and Utility Preserving Machine Learning Classifier

Authors

  • Kato Mivule
  • Claude Turner
  • Soo-Yeon Ji
Abstract

Many organizations handle large amounts of data that often contain personally identifiable information (PII) and other confidential data. Such organizations are bound by state, federal, and international laws to ensure that the confidentiality of both individuals and sensitive data is not compromised. However, during the privacy-preserving process, the utility of such datasets diminishes even as confidentiality is achieved, a problem that has been characterized as NP-hard. In this paper, we investigate a differentially private machine learning ensemble classifier approach that seeks to preserve data privacy while maintaining an acceptable level of utility. The first step of the methodology applies a strong privacy-granting technique, differential privacy, to a dataset. The resulting perturbed data is then passed through a machine learning ensemble classifier, which aims to reduce the classification error or, equivalently, to increase utility. We then examine the association between the number of weak decision tree learners and data utility, which indicates whether the ensemble learner classifies more accurately as learners are added. We found that a combined adjustment of the privacy-granting noise parameters and an increase in the number of weak learners in the ensemble might lead to a lower classification error.
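The two-step pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It assumes Laplace noise added directly to feature values as the perturbation step and AdaBoost over decision stumps as the ensemble; the abstract names neither. The synthetic data, the `sensitivity=1.0` value, and all function names are hypothetical.

```python
import math
import random

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def perturb(X, epsilon, sensitivity, rng):
    """Return a noisy copy of X; noise scale is sensitivity / epsilon (assumed)."""
    scale = sensitivity / epsilon
    return [[v + laplace_noise(scale, rng) for v in row] for row in X]

def stump_predict(row, j, t, pol):
    """Decision stump: predict pol if feature j >= threshold t, else -pol."""
    return pol if row[j] >= t else -pol

def fit_stump(X, y, w):
    """Exhaustively find the stump with the lowest weighted error."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for pol in (1, -1):
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if stump_predict(row, j, t, pol) != yi)
                if best is None or err < best[0]:
                    best = (err, j, t, pol)
    return best

def adaboost(X, y, n_learners):
    """Train AdaBoost with decision stumps; labels must be -1 or +1."""
    w = [1.0 / len(X)] * len(X)
    ensemble = []
    for _ in range(n_learners):
        err, j, t, pol = fit_stump(X, y, w)
        err = min(max(err, 1e-10), 1.0 - 1e-10)   # avoid log(0) and division by 0
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((alpha, j, t, pol))
        # Up-weight misclassified points, down-weight correct ones, renormalize.
        w = [wi * math.exp(-alpha * yi * stump_predict(row, j, t, pol))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, row):
    score = sum(a * stump_predict(row, j, t, pol) for a, j, t, pol in ensemble)
    return 1 if score >= 0 else -1

def error_rate(ensemble, X, y):
    return sum(predict(ensemble, r) != yi for r, yi in zip(X, y)) / len(X)

def make_data(n, rng):
    """Hypothetical two-class synthetic data (two Gaussian blobs)."""
    X, y = [], []
    for i in range(n):
        label = 1 if i % 2 == 0 else -1
        X.append([rng.gauss(2.0 * label, 1.0), rng.gauss(1.0 * label, 1.0)])
        y.append(label)
    return X, y

def demo(seed=0):
    """Error on held-out perturbed data for several epsilons and ensemble sizes."""
    rng = random.Random(seed)
    X, y = make_data(160, rng)
    results = {}
    for epsilon in (0.5, 5.0):
        # sensitivity=1.0 is an illustrative assumption, not a calibrated value.
        Xp = perturb(X, epsilon, sensitivity=1.0, rng=rng)
        ensemble = adaboost(Xp[:80], y[:80], 20)
        for k in (1, 5, 20):
            # The first k learners of an AdaBoost run form the k-learner ensemble.
            results[(epsilon, k)] = error_rate(ensemble[:k], Xp[80:], y[80:])
    return results

if __name__ == "__main__":
    for (epsilon, k), err in demo().items():
        print(f"epsilon={epsilon} learners={k} error={err:.3f}")
```

Varying `epsilon` and the number of learners together mirrors the combined adjustment the abstract refers to. Whether the error actually falls depends on the data and noise level, so the numbers printed here should not be read as reproducing the paper's results.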


Similar resources

Differentially Private Local Electricity Markets

Privacy-preserving electricity markets have a key role in steering customers towards participation in local electricity markets by guaranteeing to protect their sensitive information. Moreover, these markets make it possible to statically release and share the market outputs for social good. This paper aims to design a market for local energy communities by implementing Differential Privacy (DP)...


The Large Margin Mechanism for Differentially Private Maximization

A basic problem in the design of privacy-preserving algorithms is the private maximization problem: the goal is to pick an item from a universe that (approximately) maximizes a data-dependent function, all under the constraint of differential privacy. This problem has been used as a sub-routine in many privacy-preserving algorithms for statistics and machine-learning. Previous algorithms for th...


Towards Measuring Membership Privacy

Machine learning models are increasingly made available to the masses through public query interfaces. Recent academic work has demonstrated that malicious users who can query such models are able to infer sensitive information about records within the training data. Differential privacy can thwart such attacks, but not all models can be readily trained to achieve this guarantee or to achieve i...


Dynamic Privacy For Distributed Machine Learning Over Network

Privacy-preserving distributed machine learning is becoming increasingly important due to the recent rapid growth of data. This paper focuses on a class of regularized empirical risk minimization (ERM) machine learning problems, and develops two methods to provide differential privacy to distributed learning algorithms over a network. We first decentralize the learning algorithm using the alternati...


Practical Differential Privacy in High Dimensions

Privacy-preserving, and more concretely differentially private machine learning, is concerned with hiding specific details in training datasets which contain sensitive information. Many proposed differentially private machine learning algorithms have promising theoretical properties, such as convergence to non-private performance in the limit of infinite data, computational efficiency, and poly...




Publication date: 2012